A Review of Audio Features and Statistical Models Exploited for Voice Pattern Design

Authors

  • Ngoc Q. K. Duong
  • Hien-Thanh Duong
Abstract

Audio fingerprinting, also known as audio hashing, is well known as a powerful technique for audio identification and synchronization. It involves two major steps: fingerprint (voice pattern) design and matching search. The first step concerns the derivation of a robust and compact audio signature, while the second usually requires knowledge of databases and quick-search algorithms. Although this technique offers a wide range of real-world applications, to the best of the authors’ knowledge, the last comprehensive survey of existing algorithms appeared more than eight years ago. Thus, in this paper, we present a more up-to-date review and, to emphasize the audio signal processing aspect, we focus our state-of-the-art survey on the fingerprint design step, for which various audio features and their tractable statistical models are discussed.

Keywords: Voice pattern; audio identification and synchronization; spectral features; statistical models.


Similar resources

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for improving the performance of GMM-based voice conversion systems, taking the GMM method as the baseline. In current methods, because a single conversion function is used to convert all speech units, and because of the spectral smoothing that arises from statistical averaging, we will observe quality ...


Identification & Detection System for Animals from their Vocalization

Until now, little research has been done in the area of animal sound identification. Animal sound classification and retrieval are very helpful for bioacoustics and audio retrieval applications. This paper is a literature review of an animal identification and detection system based on animal voice pattern recognition. The system uses the Zero-Crossing Rate (ZCR), Mel-Frequency Cepstral Coefficients ...


DNN-based Causal Voice Activity Detector

Voice Activity Detectors (VADs) are important components in audio processing algorithms. In general, VADs are two-way classifiers that flag the audio frames containing voice activity. Most of them are based on the signal energy and build statistical models of the noise background and the speech signal. In the process of derivation, we are limited to simplified statistical models and this limi...


Statistical stylistics al-Hadid and at-taghabun based on Johnson

Linguists of the late twentieth century have paid particular attention to statistical stylistics. In this type of stylistics, texts are studied through statistical analysis, and the results are used to identify the unique features and merits of a text, author, or genre. Among the leading theorists in this field, Johnson is the theory of biological design vocabulary, style-statistical research ...


The Effect of Gloss Type and Mode on Iranian EFL Learners’ Vocabulary Acquisition

Vocabulary is an important component of language proficiency which provides the basis for learners’ performance in other skills. But, since vocabulary learning seems to be so demanding, learners tend to forget newly-learnt words quite soon. In order to identify vocabulary learning conditions which can produce a more lasting effect, this study investigated the effect of three kinds of gloss cond...



Journal title:
  • CoRR

Volume abs/1502.06811


Publication date 2015